net, tests, stuntime: Wait for VMI affinity reconciliation before migration #5347

Open
Anatw wants to merge 4 commits into RedHatQE:main from Anatw:stuntime_fix_affinity_issue

Conversation

@Anatw

@Anatw Anatw commented Jun 23, 2026

Contributor
What this PR does / why we need it:

When a VM template affinity is updated and migration is triggered immediately after, virt-controller and the VMIM controller race: the VMIM controller may read the VMI before the VM controller has reconciled the template change, creating the migration target pod with stale affinity rules. The VM ends up on the wrong node.

This PR adds wait_for_vmi_affinity() to BaseVirtualMachine, which polls the VMI until its affinity matches the VM template — ensuring the VM controller has reconciled before migration is triggered. All stuntime tests (L2 bridge + localnet) now call it after set_template_affinity.

Additionally, temporary post-migration affinity assertions are added (gated behind is_jira_open("CNV-90576")) to detect the race if it still occurs despite the wait. These assertions will be removed once CNV-90576 is resolved.

The first commit removes a redundant set_template_affinity call from test_server_migrates_between_non_client_nodes — the preceding test already sets the same affinity and @pytest.mark.incremental preserves it.

Which issue(s) this PR fixes:
Special notes for reviewer:
jira-ticket:

NONE

Summary by CodeRabbit

  • Bug Fixes
    • Improved template affinity handling so VM affinity changes are confirmed before continuing.
    • Added stronger waiting and verification around live migration scenarios to reduce flaky results.
    • Updated migration tests to check affinity state before and after moving VMs in several common cases.
    • Removed an outdated affinity setup from one migration scenario.
    • Added clearer logging when a VM migration completes.

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 3cea1e74-af00-4652-aac0-10fe4f5d1067

📥 Commits

Reviewing files that changed from the base of the PR and between f5ce311 and e012eb6.

📒 Files selected for processing (4)
  • libs/vm/vm.py
  • tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py
  • tests/network/localnet/migration_stuntime/test_migration_stuntime.py
  • utilities/virt.py

📝 Walkthrough

This PR adds retry-based waiting for VMI affinity reconciliation in BaseVirtualMachine.set_template_affinity, introduces a helper _expected_vmi_affinity that polls and compares VMI spec affinity, updates l2_bridge and localnet migration stuntime tests to wait for/validate affinity around migration, removes some anti-affinity setup calls, and adds a migration node logging statement.

Changes

Affinity-aware migration stuntime checks

| Layer / File(s) | Summary |
| --- | --- |
| VMI affinity wait helper: libs/vm/vm.py | set_template_affinity computes expected_affinity via _filter_out_none_values and calls a new @retry-decorated _expected_vmi_affinity helper that polls self.vmi.spec.affinity for up to 10s, raising ValueError on mismatch. |
| Migration node logging: utilities/virt.py | migrate_vm_and_verify logs source and destination node names after migration verification. |
| L2 bridge migration test sequencing: tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py | Import block reformatted for affinity assertion support; four tests now wait for VMI affinity before migration and conditionally assert post-migration affinity (gated by is_jira_open("CNV-90576")); an explicit anti-affinity setup call is removed from the non-client-nodes scenario. |
| Localnet migration test sequencing: tests/network/localnet/migration_stuntime/test_migration_stuntime.py | Four tests gain a new step between template affinity setup and migration verification; the non-client-nodes scenario has its set_template_affinity(...) call removed. |
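
The expected-affinity computation mentioned in the summary (a dataclass converted with `asdict` and a `dict_factory` that drops `None` fields) can be illustrated as follows. This is a sketch only: the two dataclasses, the standalone `filter_out_none_values` function, and the `stuntime.role` label key are hypothetical and do not reproduce the repository's types.

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class LabelSelectorRequirement:  # hypothetical stand-in type
    key: str
    operator: str
    values: Optional[list] = None


@dataclass
class PodAffinityTerm:  # hypothetical stand-in type
    topologyKey: str
    matchExpressions: Optional[list] = None
    namespaces: Optional[list] = None


def filter_out_none_values(pairs: list) -> dict:
    """dict_factory for asdict: keep only the fields whose value is not None."""
    return {key: value for key, value in pairs if value is not None}


term = PodAffinityTerm(
    topologyKey="kubernetes.io/hostname",
    matchExpressions=[LabelSelectorRequirement(key="stuntime.role", operator="Exists")],
)

# asdict applies the factory to nested dataclasses as well, so unset fields
# disappear at every level of the resulting dictionary.
expected_affinity: dict[str, Any] = asdict(term, dict_factory=filter_out_none_values)
```

Dropping `None` fields matters for the comparison: an object read back from the cluster typically omits unset fields, so a dictionary that still contained `None` entries would never compare equal to it.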

Estimated code review effort: 3 (Moderate) | ~20 minutes

Possibly related PRs

Suggested reviewers: EdDev, yossisegev, orelmisan, nirdothan, vsibirsk, frenzyfriday

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title is under 120 characters and clearly describes the main change: waiting for VMI affinity reconciliation before migration. |
| Description check | ✅ Passed | The description follows the required template and explains the change, though the issue and reviewer sections are left blank. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stp Link Required | ✅ Passed | No new test files or new def test_* functions were added; the diff only edits existing tests, so the STP-link requirement doesn’t apply. |
Warning

Review ran into problems

🔥 Problems

Linked repositories: Your configuration references 1 linked repositories, but your current plan allows 0. Analyzed ``, skipped RedHatQE/openshift-virtualization-tests-design-docs.


Comment @coderabbitai help to get the list of available commands.

@openshift-virtualization-qe-bot-6


Report bugs in Issues

Welcome! 🎉

This pull request will be automatically processed with the following features:

🔄 Automatic Actions

  • Reviewer Assignment: Reviewers are automatically assigned based on the OWNERS file in the repository root
  • Size Labeling: PR size labels (XS, S, M, L, XL, XXL) are automatically applied based on changes
  • Issue Creation: A tracking issue is created for this PR and will be closed when the PR is merged or closed
  • Branch Labeling: Branch-specific labels are applied to track the target branch
  • Auto-verification: Auto-verified users have their PRs automatically marked as verified
  • Labels: Enabled categories: branch, can-be-merged, cherry-pick, has-conflicts, hold, needs-rebase, size, verified, wip

📋 Available Commands

PR Status Management

  • /wip - Mark PR as work in progress (adds WIP: prefix to title)
  • /wip cancel - Remove work in progress status
  • /hold - Block PR merging (approvers only)
  • /hold cancel - Unblock PR merging
  • /verified - Mark PR as verified
  • /verified cancel - Remove verification status
  • /reprocess - Trigger complete PR workflow reprocessing (useful if webhook failed or configuration changed)
  • /regenerate-welcome - Regenerate this welcome message
  • /security-override - Set security check runs to pass (maintainers only)
  • /security-override cancel - Re-run security checks

Review & Approval

  • /lgtm - Approve changes (looks good to me)
  • /approve - Approve PR (approvers only)
  • /assign-reviewers - Assign reviewers based on OWNERS file
  • /assign-reviewer @username - Assign specific reviewer
  • /check-can-merge - Check if PR meets merge requirements

Testing & Validation

  • /retest tox - Run Python test suite with tox
  • /retest build-container - Rebuild and test container image
  • /retest verify-bugs-are-open - verify-bugs-are-open
  • /retest all - Run all available tests

Container Operations

  • /build-and-push-container - Build and push container image (tagged with PR number)
    • Supports additional build arguments: /build-and-push-container --build-arg KEY=value

Cherry-pick Operations

  • /cherry-pick <branch> - Schedule cherry-pick to target branch when PR is merged
    • Multiple branches: /cherry-pick branch1 branch2 branch3
  • /cherry-pick-retry <branch> - Retry a failed cherry-pick (merged PRs only)

Branch Management

  • /rebase - Rebase this PR branch onto its base branch

Label Management

  • /<label-name> - Add a label to the PR
  • /<label-name> cancel - Remove a label from the PR

✅ Merge Requirements

This PR will be automatically approved when the following conditions are met:

  1. Approval: /approve from at least one approver
  2. LGTM Count: Minimum 2 /lgtm from reviewers
  3. Status Checks: All required status checks must pass
  4. No Blockers: No wip, hold, has-conflicts labels and PR must be mergeable (no conflicts)
  5. Verified: PR must be marked as verified

📊 Review Process

Approvers and Reviewers

Approvers:

  • EdDev
  • dshchedr
  • myakove
  • rnetser
  • vsibirsk

Reviewers:

  • Anatw
  • EdDev
  • RoniKishner
  • azhivovk
  • dshchedr
  • frenzyfriday
  • nirdothan
  • orelmisan
  • rnetser
  • servolkov
  • vsibirsk
  • yossisegev
Available Labels
  • hold
  • verified
  • wip
  • lgtm
  • approve
AI Features
  • Cherry-Pick Conflict Resolution: Enabled (claude/claude-opus-4-6-1m)
Security Checks
  • Suspicious Path Detection: Monitors paths: .claude/, .vscode/, .cursor/, .devcontainer/, .pi/, .github/workflows/, .github/actions/
  • Committer Identity Check: Verifies last committer matches PR author
  • Mandatory: Security checks block merge (use /security-override to bypass — maintainers only)

💡 Tips

  • WIP Status: Use /wip when your PR is not ready for review
  • Verification: The verified label is removed on new commits unless the push is detected as a clean rebase
  • Cherry-picking: Cherry-pick labels are processed when the PR is merged
  • Container Builds: Container images are automatically tagged with the PR number
  • Permission Levels: Some commands require approver permissions
  • Auto-verified Users: Certain users have automatic verification and merge privileges

For more information, please refer to the project documentation or contact the maintainers.

@Anatw

Anatw commented Jun 23, 2026

Contributor Author

/wip

@openshift-virtualization-qe-bot-4 openshift-virtualization-qe-bot-4 changed the title net, tests, stuntime: Wait for VMI affinity reconciliation before migration WIP: net, tests, stuntime: Wait for VMI affinity reconciliation before migration Jun 23, 2026
@Anatw

Anatw commented Jun 23, 2026

Contributor Author

/build-and-push-container

@openshift-virtualization-qe-bot-4


New container for quay.io/openshift-cnv/openshift-virtualization-tests:pr-5347 published

@yossisegev yossisegev left a comment

Contributor

Thank you Anat

Comment thread libs/vm/vm.py Outdated
expected_affinity = (
    asdict(obj=template_affinity, dict_factory=self._filter_out_none_values) if template_affinity else None
)
for sample in TimeoutSampler(

Contributor

Consider using a helper with @retry

Contributor Author

Ack.
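
For illustration, the reviewer's suggestion of moving the polling into a decorated helper might look roughly like this. The `retry` decorator below is written from scratch for the example; its name, parameters, and semantics are assumptions and may differ from the `@retry` used in the repository. `FakeVM` is likewise hypothetical.

```python
import functools
import time
from typing import Any, Callable


def retry(wait_timeout: float, sleep: float) -> Callable:
    """Hypothetical decorator: re-call the function until it returns a truthy value."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            deadline = time.monotonic() + wait_timeout
            while True:
                result = func(*args, **kwargs)
                if result:
                    return result
                if time.monotonic() >= deadline:
                    raise TimeoutError(
                        f"{func.__name__} did not succeed within {wait_timeout}s"
                    )
                time.sleep(sleep)

        return wrapper

    return decorator


class FakeVM:
    """Stand-in object whose observed affinity changes over successive reads."""

    def __init__(self, observed_sequence: list, expected: Any) -> None:
        self._observed = iter(observed_sequence)
        self.expected = expected

    @retry(wait_timeout=1.0, sleep=0.01)
    def affinity_matches(self) -> bool:
        # Each call consumes one simulated read of the VMI affinity.
        return next(self._observed) == self.expected


vm = FakeVM(observed_sequence=[None, None, {"a": 1}], expected={"a": 1})
result = vm.affinity_matches()
```

Compared with an inline sampling loop, the decorated helper keeps the comparison logic in one small method and leaves the timing policy to the decorator.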

Comment thread tests/network/libs/stuntime.py Outdated
Comment on lines +164 to +168
has_stuntime_rules = any(
    expr.get("key", "").startswith("stuntime.")
    for rule in pod_affinity[stale_type].get("requiredDuringSchedulingIgnoredDuringExecution", [])
    for expr in rule.get("labelSelector", {}).get("matchExpressions", [])
)

Contributor

The rules are a bit hard to read.
Consider splitting them into variables.

Contributor Author

Ack, let me know if this is better or I should split further.
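
One possible way to split the nested generator into named steps is sketched below. It is not the code that landed in the PR: the function signature, the constants, and the sample data are assumptions made for the example, and the `or {}` / `or []` fallbacks are an added choice so that a missing affinity yields `False` instead of raising.

```python
from typing import Any, Optional

STUNTIME_LABEL_PREFIX = "stuntime."
REQUIRED_RULES_KEY = "requiredDuringSchedulingIgnoredDuringExecution"


def has_stuntime_rules(pod_affinity: Optional[dict], affinity_type: str) -> bool:
    """Return True if any required rule of the given type selects a stuntime label."""
    affinity_of_type: dict[str, Any] = (pod_affinity or {}).get(affinity_type) or {}
    required_rules: list = affinity_of_type.get(REQUIRED_RULES_KEY) or []
    match_expressions = [
        expression
        for rule in required_rules
        for expression in ((rule.get("labelSelector") or {}).get("matchExpressions") or [])
    ]
    label_keys = [expression.get("key", "") for expression in match_expressions]
    return any(key.startswith(STUNTIME_LABEL_PREFIX) for key in label_keys)


# Made-up pod affinity for demonstration; the label key is hypothetical.
sample_affinity = {
    "podAntiAffinity": {
        REQUIRED_RULES_KEY: [
            {
                "labelSelector": {
                    "matchExpressions": [{"key": "stuntime.role", "operator": "Exists"}]
                },
                "topologyKey": "kubernetes.io/hostname",
            }
        ]
    }
}
anti_affinity_found = has_stuntime_rules(sample_affinity, "podAntiAffinity")
affinity_found = has_stuntime_rules(sample_affinity, "podAffinity")
```

Each intermediate name states what the data is at that step, which is the readability gain the review asks for.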

@openshift-virtualization-qe-bot-3

Contributor

/retest all

Auto-triggered: Files in this PR were modified by merged PR #5285.

Overlapping files

libs/vm/vm.py

@Anatw

Anatw commented Jun 30, 2026

Contributor Author

Changes:

  • set_template_affinity - Replace TimeoutSampler with @retry.
  • assert_affinity_after_migration - split into variables.

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py`:
- Around line 83-86: The cleanup for the `active_ping` session in
`measure_stuntime()` is only happening after the Jira-gated
`assert_affinity_after_migration()` path, so a failure there can leave the ping
running. Move the ping stop/report logic out of the normal flow and into a
`finally` block or fixture teardown around the migration/assertion sequence in
the stuntime migration tests so it always runs, even when
`assert_affinity_after_migration()` fails. Apply the same pattern in the related
`test_migration_stuntime` scenarios that use `wait_for_vmi_affinity()`,
`migrate_vm_and_verify()`, and `measure_stuntime()`.

In `@tests/network/libs/stuntime.py`:
- Around line 158-160: The membership check in the pod affinity assertion can
crash when vm.vmi.virt_launcher_pod.instance.to_dict()["spec"].get("affinity")
returns None, so update the diagnostic logic in this helper to handle the
missing-affinity case before checking expected_type in pod_affinity. Use a
pytest assertion in stuntime.py that explicitly verifies affinity is present and
then checks for the expected and stale types, with a clear failure message that
explains which affinity state was missing or unexpected.

In `@tests/network/localnet/migration_stuntime/test_migration_stuntime.py`:
- Around line 83-86: The migration stuntime tests are asserting affinity before
the ping session is guaranteed to be stopped, so a failure can leave active_ping
running and leak state into later migrations. Update the affected test blocks
around migrate_vm_and_verify(), assert_affinity_after_migration(), and
measure_stuntime() so ping cleanup is unconditional, using a finally block or
fixture teardown to stop/report the ping even if the affinity assertion fails.
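
The unconditional-cleanup pattern requested in the findings above can be sketched with a context manager whose `finally` clause stops the ping. Everything here is hypothetical: `FakePing`, `active_ping`, and `migrate_and_assert` are illustration-only names and do not mirror the fixtures or helpers in the test suite.

```python
from contextlib import contextmanager
from typing import Iterator


class FakePing:
    """Stand-in for a background ping session."""

    def __init__(self) -> None:
        self.running = False
        self.stop_calls = 0

    def start(self) -> None:
        self.running = True

    def stop(self) -> None:
        self.running = False
        self.stop_calls += 1


@contextmanager
def active_ping(ping: FakePing) -> Iterator[FakePing]:
    """Start the ping and guarantee it is stopped, even if the body raises."""
    ping.start()
    try:
        yield ping
    finally:
        ping.stop()


def migrate_and_assert(ping: FakePing, affinity_ok: bool) -> None:
    with active_ping(ping):
        # The migration and the post-migration affinity assertion would run here.
        if not affinity_ok:
            raise AssertionError("affinity mismatch after migration")


ping = FakePing()
try:
    migrate_and_assert(ping, affinity_ok=False)
except AssertionError:
    pass  # the assertion failed, yet the ping was still stopped
```

A pytest fixture that yields the ping and stops it after the `yield` would give the same guarantee through teardown.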

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: c139bd34-f5f3-4fab-aba9-96a051950522

📥 Commits

Reviewing files that changed from the base of the PR and between d503c68 and f5ce311.

📒 Files selected for processing (4)
  • libs/vm/vm.py
  • tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py
  • tests/network/libs/stuntime.py
  • tests/network/localnet/migration_stuntime/test_migration_stuntime.py

Comment thread tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py Outdated
Comment thread tests/network/libs/stuntime.py Outdated
Comment thread tests/network/localnet/migration_stuntime/test_migration_stuntime.py Outdated

@azhivovk azhivovk left a comment

Contributor

Thanks
I have a few questions

Comment thread libs/vm/vm.py
Comment thread tests/network/libs/stuntime.py Outdated
@Anatw

Anatw commented Jun 30, 2026

Contributor Author

/verified

Tested on bm03-tlv2
Openshift version: 4.22.0-rc.2
CNV version: 4.22.0
HCO image: brew.registry.redhat.io/rh-osbs/iib:1158778

uv run pytest -s --skip-deprecated-api-test --pdbcls=IPython.terminal.debugger:TerminalPdb --tc-format=python --cluster-sanity-skip-storage-check -m "tier3" tests/network/localnet/migration_stuntime/test_migration_stuntime.py tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py
==== 24 passed, 1 deselected, 237 warnings in 2642.62s (0:44:02) ====

@openshift-virtualization-qe-bot-3

Contributor

@coderabbitai

Test execution plan request details

CRITICAL: You MUST post an inline review comment on the first changed line of the first file.
The inline comment should contain the full Test Execution Plan (smoke decision, gating decision, and specific affected tests).
Do NOT submit a blocking review event (REQUEST_CHANGES/APPROVE).
Post a single inline PR comment on Files Changed (non-blocking COMMENT flow).

As an expert software testing engineer, analyze all modified files in this PR and create a targeted test execution plan.
You will post an inline review comment with the test execution plan on the first changed file.
If you fail to run or post a comment, retry.

Analysis Requirements:

  1. Examine code changes in each modified file

  2. Identify affected code paths, functions, and classes

  3. Analyze pytest-specific elements: fixtures (scope, dependencies), parametrization, markers, conftest changes

  4. Trace test dependencies through imports, shared utilities, fixture inheritance, fixture teardown, and yield from cleanup in conftest

  5. Detect new tests introduced in the PR

  6. Utilities and libs impact (when utilities/ or libs/ changes):
    You MUST use shell scripts (rg, git diff) to trace the full impact.
    Follow these sub-steps in order:

    6a. Identify modified symbols: For each changed file under utilities/ or libs/,
    list every modified function or method.
    Example: git diff HEAD~1 --unified=0 -- utilities/hco.py | grep '^[+-]def '

    6b. Find direct callers: Search tests and conftest for each symbol from 6a.
    Example: rg -l 'get_hco_version' tests/

    6c. Trace fixture teardown and cleanup: Find fixtures that reach
    the modified symbol through yield from or context-manager wrappers.
    Example: rg -l 'yield from.*enable_common_boot|def.*enable_common_boot' tests/

    6d. Trace same-file callers: In each changed file, find other functions
    whose body calls a modified symbol (including code after yield
    in @contextmanager helpers).
    Example: rg 'get_hco_version|enable_common_boot' utilities/hco.py

    6e. Expand transitively: If function A calls modified B, then
    tests/fixtures that call A are affected — even when the test body
    never imports B directly.

    Do NOT limit impact to tests that import the modified symbol only.

  7. Smoke test impact: Intersect the affected set from step 6 with smoke-marked tests.
    Run: rg -l '@pytest.mark.smoke' tests/
    VERIFY the above command returned actual file paths before concluding False.
    Set True if either condition is met:

    • a smoke-marked file appears in the affected set from 6b-6e, OR
    • any conftest.py in the smoke test's parent-directory hierarchy (up to repo root)
      imports or calls a modified utilities/libs symbol — including autouse fixtures
      that depend on modified functions. ALL tests in that directory and below are affected.
      Example check: for each smoke_file, scan dirname(smoke_file)/conftest.py,
      dirname(dirname(smoke_file))/conftest.py, etc. for modified symbol imports
      and autouse fixtures that depend on modified symbols.
  8. Gating test impact: Intersect the affected set from step 6 with gating-marked tests.
    Run: rg -l '@pytest.mark.gating' tests/
    Set True if a gating-marked file also appears in the affected set from 6b-6e.
    Utilities/libs changes often affect gating tests without affecting smoke tests.
    Do NOT stop analysis after concluding Run smoke tests: False.

Output rules:
Do NOT include analysis step numbers (1-8) in your visible output.

Your deliverable:
Your inline informational comment will be based on the following requirements:

Test Execution Plan

  • Run smoke tests: True / False — If True, state the dependency path (test → fixture → changed symbol). True ONLY with a verified path.
  • Run gating tests: True / False — If True, state the dependency path. True if any gating-marked test is in the affected set.
  • Affected tests to run (required when utilities/, libs/, or shared conftest changes — list concrete paths even when smoke is False)

Use these formats:

  • path/to/test_file.py - When the entire test file needs verification
  • path/to/test_file.py::TestClass::test_method - When specific test(s) needed
  • path/to/test_file.py::test_function - When specific test(s) needed
  • -m marker - When a marker covers multiple affected tests (e.g. -m gating only if ALL gating tests in scope need run)
  • Tag each listed test or group with its marker when not obvious, e.g. (gating) or (smoke)

Real test commands (MANDATORY when changes affect session/runtime code):

When the affected code runs at session/collection time (conftest fixtures, pytest plugins,
config hooks, session-scoped setup) or modifies runtime behavior that unit tests mock away,
you MUST include concrete pytest commands the PR author must run on a real cluster
to verify the change works end-to-end. Include:

  • A command for the error/fix path (the scenario the PR fixes)
  • A command for the happy path (regression: the normal case still works)
  • Use lightweight tests (e.g., --collect-only for startup failures,
    a single small test for runtime behavior)
    If the PR only changes test logic (not utilities/libs/conftest), the affected test
    paths themselves serve as the real test commands — no separate section needed.

Example output for a session-startup fix:

**Real tests (cluster required)**
Error path (the fix):
`pytest tests/storage/.../test_foo.py --storage-class-matrix=nonexistent-sc --collect-only`
Expected: ValueError with clear message, not IndexError

Happy path (regression):
`pytest tests/storage/.../test_foo.py --storage-class-matrix=<valid-sc> -k test_bar`
Expected: session starts normally

Guidelines:

  • Include tests affected directly OR via fixture setup/teardown, yield from cleanup, or transitive utility call chains (caller calls modified helper)
  • Use a full file path only if ALL tests in that file require verification
  • Use file path + test name when only specific tests use an affected fixture or utility wrapper (preferred for partial file impact)
  • If a test marker can cover multiple files/tests, provide the marker
  • Balance coverage vs over-testing - Keep descriptions minimal
  • Example: if leaf helper foo() changes, include tests whose fixture teardown calls wrapper bar() where bar() calls foo(), even when the test body only imports an unrelated symbol from the same utilities module

Hardware-Related Checks (SR-IOV, GPU, DPDK):

When PR modifies fixtures for hardware-specific resources:

  • Collection Safety: Fixtures MUST have existence checks (return None when hardware unavailable)
  • Test Plan: MUST verify both WITH and WITHOUT hardware:
    • Run affected tests on cluster WITH hardware
    • Verify collection succeeds on cluster WITHOUT hardware

CRITICAL WORKFLOW COMPLETION RULES:

When responding to this test execution plan request, you MUST follow these rules EXACTLY:

  1. YOUR ONLY DELIVERABLE: Post one non-blocking inline comment containing the test execution plan on the first changed line
  2. THEN STOP IMMEDIATELY - Do NOT generate any additional response
  3. FALLBACK ONLY: If inline comment API calls fail after retrying, post as a regular PR comment
  4. SILENCE = SUCCESS: After successfully submitting the review, your task is complete. No confirmation needed.

ABSOLUTE PROHIBITIONS (violating these creates empty/meaningless reviews):

  • ❌ Do NOT post acknowledgment messages like "Test execution plan posted", "Review posted successfully", "I've successfully posted"
  • ❌ Do NOT mention review IDs, URLs, or confirmation of posting in the PR thread
  • ❌ Do NOT add any follow-up comments after submitting the review
  • ❌ Do NOT reply to confirm task completion
  • ❌ Do NOT explain what you did - just do it and stop

Remember: The pull request review is visible to users. Additional comments are redundant noise.

CRITICAL — Verification rules for smoke and gating decisions:
- For each True decision, you must have traced a concrete path: test file → fixture → utility function → modified symbol.
- Do NOT assume impact based on scope, file proximity, or infrastructure-sounding names.
Only set True when you trace a concrete path through steps 6a-6e, then 7 or 8.
WRONG: "utilities/hco.py changed → run smoke" (no traced dependency path)
WRONG: "Run smoke tests: False" with no affected-test list when gating tests use the changed utility
RIGHT: "Run smoke tests: False, Run gating tests: True" plus tests/.../test_foo.py::TestBar::test_baz (gating)
RIGHT: "Smoke test X uses fixture Y → wrapper Z → changed function W in utilities/hco.py"
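
The transitive expansion that the rules above require (if A calls a modified B, then callers of A are affected too) amounts to a reachability search over a reverse call graph. The sketch below assumes such a graph has already been collected, for example from `rg` output; the `affected_callers` function and the sample graph are hypothetical.

```python
from collections import deque


def affected_callers(callers_of: dict[str, set[str]], modified: set[str]) -> set[str]:
    """Return every symbol that reaches a modified symbol through any chain of calls."""
    affected: set[str] = set()
    queue = deque(modified)
    while queue:
        symbol = queue.popleft()
        for caller in callers_of.get(symbol, set()):
            if caller not in affected:
                affected.add(caller)
                queue.append(caller)
    return affected


# Reverse call graph: each key maps to the set of symbols that call it.
callers_of = {
    "foo": {"bar"},
    "bar": {"fixture_teardown"},
    "fixture_teardown": {"tests/test_a.py::test_x"},
    "unrelated": {"tests/test_b.py::test_y"},
}
affected = affected_callers(callers_of, modified={"foo"})
```

Intersecting the resulting set with the smoke-marked or gating-marked tests then gives a traced dependency path rather than a guess based on file proximity.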

@coderabbitai

coderabbitai Bot commented Jun 30, 2026

Contributor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

repo="RedHatQE/openshift-virtualization-tests"
pr="5347"

echo "== PR head SHA =="
head_sha=$(gh pr view "$pr" --repo "$repo" --json headRefOid --jq '.headRefOid')
echo "$head_sha"

echo
echo "== Changed files in order =="
mapfile -t pr_files < <(gh pr view "$pr" --repo "$repo" --json files --jq '.files[].path')
printf '%s\n' "${pr_files[@]}"

first_file="${pr_files[0]}"
echo
echo "== First changed file =="
echo "$first_file"

echo
echo "== Patch for first changed file (head) =="
gh api "repos/$repo/pulls/$pr/files?per_page=100" --jq '.[] | select(.filename=="'"$first_file"'") | .patch'

echo
echo "== Current diff for changed files =="
for f in "${pr_files[@]}"; do
  echo "--- $f ---"
  git diff --unified=3 "origin/$(git rev-parse --abbrev-ref HEAD)~1" -- "$f" 2>/dev/null || git diff --unified=3 HEAD~1 -- "$f" || true
done

echo
echo "== Outline for candidate library/shared files =="
for f in libs/vm/vm.py tests/network/libs/stuntime.py; do
  if [ -f "$f" ]; then
    echo "--- $f ---"
    ast-grep outline "$f" --view expanded | sed -n '1,240p'
  fi
done

echo
echo "== Modified function/method definitions in libs/utilities/shared files =="
for f in libs/vm/vm.py tests/network/libs/stuntime.py; do
  if [ -f "$f" ]; then
    echo "--- $f ---"
    git diff --unified=0 HEAD~1 -- "$f" | rg '^[+-]\s*(def |class )' || true
  fi
done

echo
echo "== Direct callers/imports in tests and conftest =="
rg -n -C2 '\bwait_for_vmi_affinity\s*\(|\bassert_affinity_after_migration\s*\(|\bset_template_affinity\s*\(' tests tests/**/conftest.py || true

echo
echo "== Same-file callers in changed library/shared files =="
for f in libs/vm/vm.py tests/network/libs/stuntime.py; do
  if [ -f "$f" ]; then
    echo "--- $f ---"
    rg -n -C2 'wait_for_vmi_affinity|assert_affinity_after_migration|_assert_vmi_affinity|_filter_out_none_values' "$f" || true
  fi
done

echo
echo "== Changed test files: markers and test defs =="
for f in tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py tests/network/localnet/migration_stuntime/test_migration_stuntime.py; do
  if [ -f "$f" ]; then
    echo "--- $f markers ---"
    rg -n '@pytest\.mark\.(smoke|gating|tier[0-9]+|polarion|parametrize)' "$f" || true
    echo "--- $f tests ---"
    rg -n '^\s*def test_' "$f" || true
  fi
done

echo
echo "== Smoke-marked files =="
rg -l '@pytest\.mark\.smoke' tests || true

echo
echo "== Gating-marked files =="
rg -l '@pytest\.mark\.gating' tests || true

echo
echo "== Parent conftest references to modified symbols =="
rg -n -C2 'wait_for_vmi_affinity|assert_affinity_after_migration|set_template_affinity' tests/**/conftest.py tests/conftest.py || true

echo
echo "== First changed line in first changed file =="
first_line=$(
  gh api "repos/$repo/pulls/$pr/files?per_page=100" --jq '.[] | select(.filename=="'"$first_file"'") | .patch' \
  | python - <<'PY'
import sys,re
patch=sys.stdin.read().splitlines()
cur=None
for line in patch:
    m=re.match(r'^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@', line)
    if m:
        cur=int(m.group(1))
        continue
    if cur is None:
        continue
    if line.startswith('+') and not line.startswith('+++'):
        print(cur)
        break
    if line.startswith('-') and not line.startswith('---'):
        continue
    cur += 1
PY
)
echo "$first_line"

body="$(cat <<'EOF'
**Test Execution Plan**

- **Run smoke tests: False** — no verified dependency path from any `@pytest.mark.smoke` test or parent `conftest.py` to `libs/vm/vm.py::BaseVirtualMachine.wait_for_vmi_affinity()` or `tests/network/libs/stuntime.py::assert_affinity_after_migration()`.
- **Run gating tests: False** — no verified dependency path from any `@pytest.mark.gating` test to the changed symbols.

- **Affected tests to run**
  - `tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_client_migrates_off_server_node` (tier3)
  - `tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_client_migrates_to_server_node` (tier3)
  - `tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_server_migrates_off_client_node` (tier3)
  - `tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_server_migrates_to_client_node` (tier3)
  - `tests/network/localnet/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_client_migrates_off_server_node` (tier3)
  - `tests/network/localnet/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_client_migrates_to_server_node` (tier3)
  - `tests/network/localnet/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_server_migrates_off_client_node` (tier3)
  - `tests/network/localnet/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_server_migrates_to_client_node` (tier3)

These are the concrete affected paths because the changed tests now call `BaseVirtualMachine.wait_for_vmi_affinity()` directly after `set_template_affinity()`, and they conditionally call `tests/network/libs/stuntime.py::assert_affinity_after_migration()` after `migrate_vm_and_verify()`.

**Real tests (cluster required)**

Error path (the fix):
`pytest -s -o log_cli=true -m tier3 --jira tests/network/localnet/migration_stuntime/test_migration_stuntime.py -k 'test_client_migrates_off_server_node or test_server_migrates_off_client_node'`
Expected: after `set_template_affinity()`, the VMI affinity converges before migration, the migrated VM lands with matching anti-affinity rules, and stuntime stays within `STUNTIME_THRESHOLD_SECONDS`.

Happy path (regression):
`pytest -s -o log_cli=true -m tier3 --jira tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py -k 'test_client_migrates_to_server_node or test_server_migrates_to_client_node'`
Expected: affinity-targeted migrations still succeed, the target pod reflects the expected affinity type when `CNV-90576` is open, and stuntime stays within `STUNTIME_THRESHOLD_SECONDS`.
EOF
)"

echo
echo "== Attempt inline review comment posting =="
set +e
inline_ok=0
for attempt in 1 2 3; do
  echo "-- inline attempt $attempt --"
  gh api \
    --method POST \
    "repos/$repo/pulls/$pr/comments" \
    -f body="$body" \
    -f commit_id="$head_sha" \
    -f path="$first_file" \
    -F line="$first_line" \
    -f side="RIGHT"
  rc=$?
  echo "inline_rc=$rc"
  if [ $rc -eq 0 ]; then
    inline_ok=1
    break
  fi
  sleep 2
done

if [ $inline_ok -ne 1 ]; then
  echo
  echo "== Inline failed; attempt fallback regular PR comment =="
  gh api \
    --method POST \
    "repos/$repo/issues/$pr/comments" \
    -f body="$body"
  echo "fallback_rc=$?"
fi
set -e

Length of output: 7215


🏁 Script executed:

#!/bin/bash
set -euo pipefail

repo="RedHatQE/openshift-virtualization-tests"
pr="5347"

tmpdir="$(mktemp -d)"
trap 'rm -rf "$tmpdir"' EXIT

echo "== PR head SHA =="
head_sha="$(gh pr view "$pr" --repo "$repo" --json headRefOid --jq '.headRefOid')"
echo "$head_sha"

echo
echo "== Changed files in order =="
gh pr view "$pr" --repo "$repo" --json files --jq '.files[].path' > "$tmpdir/pr_files.txt"
cat "$tmpdir/pr_files.txt"

first_file="$(sed -n '1p' "$tmpdir/pr_files.txt")"
echo
echo "== First changed file =="
echo "$first_file"

echo
echo "== Patch for first changed file (head) =="
gh api "repos/$repo/pulls/$pr/files?per_page=100" --jq '.[] | select(.filename=="'"$first_file"'") | .patch' > "$tmpdir/first_patch.txt"
cat "$tmpdir/first_patch.txt"

echo
echo "== Current diff for changed files =="
while IFS= read -r f; do
  echo "--- $f ---"
  git diff --unified=3 HEAD~1 -- "$f" || true
done < "$tmpdir/pr_files.txt"

echo
echo "== Outline for candidate library/shared files =="
for f in libs/vm/vm.py tests/network/libs/stuntime.py; do
  if [ -f "$f" ]; then
    echo "--- $f ---"
    ast-grep outline "$f" --view expanded | sed -n '1,240p'
  fi
done

echo
echo "== Modified function/method definitions in libs/utilities/shared files =="
for f in libs/vm/vm.py tests/network/libs/stuntime.py; do
  if [ -f "$f" ]; then
    echo "--- $f ---"
    git diff --unified=0 HEAD~1 -- "$f" | rg '^[+-]\s*(def |class )' || true
  fi
done

echo
echo "== Direct callers/imports in tests and conftest =="
rg -n -C2 '\bwait_for_vmi_affinity\s*\(|\bassert_affinity_after_migration\s*\(|\bset_template_affinity\s*\(' tests tests/**/conftest.py || true

echo
echo "== Same-file callers in changed library/shared files =="
for f in libs/vm/vm.py tests/network/libs/stuntime.py; do
  if [ -f "$f" ]; then
    echo "--- $f ---"
    rg -n -C2 'wait_for_vmi_affinity|assert_affinity_after_migration|_assert_vmi_affinity|_filter_out_none_values' "$f" || true
  fi
done

echo
echo "== Changed test files: markers and test defs =="
for f in tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py tests/network/localnet/migration_stuntime/test_migration_stuntime.py; do
  if [ -f "$f" ]; then
    echo "--- $f markers ---"
    rg -n '@pytest\.mark\.(smoke|gating|tier[0-9]+|polarion|parametrize)' "$f" || true
    echo "--- $f tests ---"
    rg -n '^\s*def test_' "$f" || true
  fi
done

echo
echo "== Smoke-marked files =="
rg -l '@pytest\.mark\.smoke' tests || true

echo
echo "== Gating-marked files =="
rg -l '@pytest\.mark\.gating' tests || true

echo
echo "== Parent conftest references to modified symbols =="
rg -n -C2 'wait_for_vmi_affinity|assert_affinity_after_migration|set_template_affinity' tests/**/conftest.py tests/conftest.py || true

echo
echo "== First changed line in first changed file =="
python - "$tmpdir/first_patch.txt" <<'PY' > "$tmpdir/first_line.txt"
import re, sys
patch = open(sys.argv[1], encoding="utf-8").read().splitlines()
cur = None
for line in patch:
    m = re.match(r'^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@', line)
    if m:
        cur = int(m.group(1))
        continue
    if cur is None:
        continue
    if line.startswith('+') and not line.startswith('+++'):
        print(cur)
        break
    if line.startswith('-') and not line.startswith('---'):
        continue
    cur += 1
PY
first_line="$(cat "$tmpdir/first_line.txt")"
echo "$first_line"

cat > "$tmpdir/body.md" <<'EOF'
**Test Execution Plan**

- **Run smoke tests: False** — no verified dependency path from any `@pytest.mark.smoke` test or parent `conftest.py` to `libs/vm/vm.py::BaseVirtualMachine.wait_for_vmi_affinity()` or `tests/network/libs/stuntime.py::assert_affinity_after_migration()`.
- **Run gating tests: False** — no verified dependency path from any `@pytest.mark.gating` test to the changed symbols.

- **Affected tests to run**
  - `tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_client_migrates_off_server_node` (tier3)
  - `tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_client_migrates_to_server_node` (tier3)
  - `tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_server_migrates_off_client_node` (tier3)
  - `tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_server_migrates_to_client_node` (tier3)
  - `tests/network/localnet/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_client_migrates_off_server_node` (tier3)
  - `tests/network/localnet/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_client_migrates_to_server_node` (tier3)
  - `tests/network/localnet/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_server_migrates_off_client_node` (tier3)
  - `tests/network/localnet/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_server_migrates_to_client_node` (tier3)

These are the concrete affected paths because the changed tests now call `BaseVirtualMachine.wait_for_vmi_affinity()` directly after `set_template_affinity()`, and they conditionally call `tests/network/libs/stuntime.py::assert_affinity_after_migration()` after `migrate_vm_and_verify()`.

**Real tests (cluster required)**

Error path (the fix):
`pytest -s -o log_cli=true -m tier3 --jira tests/network/localnet/migration_stuntime/test_migration_stuntime.py -k 'test_client_migrates_off_server_node or test_server_migrates_off_client_node'`
Expected: after `set_template_affinity()`, the VMI affinity converges before migration, the migrated VM lands with matching anti-affinity rules, and stuntime stays within `STUNTIME_THRESHOLD_SECONDS`.

Happy path (regression):
`pytest -s -o log_cli=true -m tier3 --jira tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py -k 'test_client_migrates_to_server_node or test_server_migrates_to_client_node'`
Expected: affinity-targeted migrations still succeed, the target pod reflects the expected affinity type when `CNV-90576` is open, and stuntime stays within `STUNTIME_THRESHOLD_SECONDS`.
EOF

body="$(cat "$tmpdir/body.md")"

echo
echo "== Attempt inline review comment posting =="
set +e
inline_ok=0
for attempt in 1 2 3; do
  echo "-- inline attempt $attempt --"
  gh api \
    --method POST \
    "repos/$repo/pulls/$pr/comments" \
    -f body="$body" \
    -f commit_id="$head_sha" \
    -f path="$first_file" \
    -F line="$first_line" \
    -f side="RIGHT"
  rc=$?
  echo "inline_rc=$rc"
  if [ "$rc" -eq 0 ]; then
    inline_ok=1
    break
  fi
  sleep 2
done

if [ "$inline_ok" -ne 1 ]; then
  echo
  echo "== Inline failed; attempt fallback regular PR comment =="
  gh api \
    --method POST \
    "repos/$repo/issues/$pr/comments" \
    -f body="$body"
  echo "fallback_rc=$?"
fi
set -e

Length of output: 46117


@openshift-virtualization-qe-bot-3

CRITICAL: I retried the required workflow on the latest PR head (f5ce3111e1b1c4723faef11205437b536846b0b2) and could not post the comment because the GitHub integration does not have comment-write permission on #5347.

Why this matters: your workflow requires a non-blocking inline PR comment on libs/vm/vm.py, Line 13. Without write access, neither the primary path nor the fallback can complete.

Posting attempts

  • Inline review comment on libs/vm/vm.py, Line 13: failed 3 times
  • Fallback regular PR comment: failed
  • Error each time: HTTP 403: Resource not accessible by integration

Here is the exact comment body to post manually on libs/vm/vm.py, Line 13:

**Test Execution Plan**

- **Run smoke tests: False** — no verified dependency path from any `@pytest.mark.smoke` test or parent `conftest.py` to `libs/vm/vm.py::BaseVirtualMachine.wait_for_vmi_affinity()` or `tests/network/libs/stuntime.py::assert_affinity_after_migration()`.
- **Run gating tests: False** — no verified dependency path from any `@pytest.mark.gating` test to the changed symbols.

- **Affected tests to run**
  - `tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_client_migrates_off_server_node` (tier3)
  - `tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_client_migrates_to_server_node` (tier3)
  - `tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_server_migrates_off_client_node` (tier3)
  - `tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_server_migrates_to_client_node` (tier3)
  - `tests/network/localnet/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_client_migrates_off_server_node` (tier3)
  - `tests/network/localnet/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_client_migrates_to_server_node` (tier3)
  - `tests/network/localnet/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_server_migrates_off_client_node` (tier3)
  - `tests/network/localnet/migration_stuntime/test_migration_stuntime.py::TestMigrationStuntime::test_server_migrates_to_client_node` (tier3)

These are the concrete affected paths because the changed tests now call `BaseVirtualMachine.wait_for_vmi_affinity()` directly after `set_template_affinity()`, and they conditionally call `tests/network/libs/stuntime.py::assert_affinity_after_migration()` after `migrate_vm_and_verify()`.

**Real tests (cluster required)**

Error path (the fix):
`pytest -s -o log_cli=true -m tier3 --jira tests/network/localnet/migration_stuntime/test_migration_stuntime.py -k 'test_client_migrates_off_server_node or test_server_migrates_off_client_node'`
Expected: after `set_template_affinity()`, the VMI affinity converges before migration, the migrated VM lands with matching anti-affinity rules, and stuntime stays within `STUNTIME_THRESHOLD_SECONDS`.

Happy path (regression):
`pytest -s -o log_cli=true -m tier3 --jira tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py -k 'test_client_migrates_to_server_node or test_server_migrates_to_client_node'`
Expected: affinity-targeted migrations still succeed, the target pod reflects the expected affinity type when `CNV-90576` is open, and stuntime stays within `STUNTIME_THRESHOLD_SECONDS`.

coderabbitai[bot]
coderabbitai Bot previously approved these changes Jun 30, 2026

@azhivovk azhivovk left a comment

Contributor

Hopefully it's okay to apply a W/A for the stuntime measurements

Comment thread libs/vm/vm.py Outdated
self._assert_vmi_affinity(expected_affinity=expected_affinity)

@retry(wait_timeout=10, sleep=1)
def _assert_vmi_affinity(self, expected_affinity: dict[str, Any] | None) -> bool:

Contributor

NIT:
Since we use assert for tests, consider renaming to expected_vmi_affinity

Contributor

+1 from me, let's not mix up concerns. assert is about the test itself; if you have to check some state in a precondition/step, please use any other way/naming, not assert.

Contributor Author

Ack, renamed to _expected_vmi_affinity

Comment thread libs/vm/vm.py Outdated
@retry(wait_timeout=10, sleep=1)
def _assert_vmi_affinity(self, expected_affinity: dict[str, Any] | None) -> bool:
vmi_affinity = self.vmi.instance.to_dict()["spec"].get("affinity")
assert vmi_affinity == expected_affinity, (

@servolkov servolkov Jun 30, 2026

Contributor

see thread. assert is a test concern, not library. Raise a clear exception instead.

Contributor Author

Ack.
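
The suggestion above (raise a clear exception from library code rather than `assert`) could look roughly like the sketch below. It is not the PR's code: `AffinityMismatchError` and `wait_for_match` are hypothetical names, and a hand-rolled polling loop stands in for the `@retry` decorator so the example stays self-contained.

```python
import time
from typing import Any, Callable


class AffinityMismatchError(Exception):
    """Hypothetical error: the observed affinity never matched the expected one."""


def wait_for_match(
    read_value: Callable[[], Any],
    expected: Any,
    timeout: float = 10.0,
    interval: float = 1.0,
) -> None:
    """Poll read_value() until it equals expected; raise on timeout instead of asserting."""
    deadline = time.monotonic() + timeout
    while True:
        observed = read_value()
        if observed == expected:
            return
        if time.monotonic() >= deadline:
            # A dedicated exception keeps `assert` reserved for test verdicts.
            raise AffinityMismatchError(
                f"expected affinity {expected!r}, last observed {observed!r} after {timeout}s"
            )
        time.sleep(interval)
```

A caller would pass something like a closure that reads the VMI spec as `read_value`; the default timeout and interval mirror the 10s/1s values quoted in the thread.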

Comment thread tests/network/libs/stuntime.py Outdated
return stuntime


def assert_affinity_after_migration(vm: BaseVirtualMachine, expected_type: str) -> None:

Contributor

if this is not a test itself, rename the helper and don't use assert. I bet we don't test affinity after migration, hence it does not deserve assertion. Otherwise, it is a mess.

Contributor Author

This function is now redundant - removed entirely.

localnet_stuntime_client_vm.set_template_affinity(affinity=new_pod_anti_affinity(label=SERVER_VM_LABEL))
localnet_stuntime_client_vm.wait_for_vmi_affinity()
migrate_vm_and_verify(vm=localnet_stuntime_client_vm, client=admin_client)
if is_jira_open(jira_id="CNV-90576"):

Contributor

btw don't we need to check affinity even without this bug? I mean, perhaps the direction is right regardless of the bug -> checking jira is not needed?

Contributor Author

Firstly - the bug is an automation bug, so the product bug this code is dependent on is not relevant anymore.

To your point - I was conflicted about this too, but I think that we shouldn't test this regardless - we expect the product to behave as it should - for example, if we set an NNCP and it is reported as being configured successfully - we assume it is not lying - we don't check the node to make sure the bridge is actually there.
But I'm not 100% convinced, I'm curious about @EdDev's take on this.

Contributor

I do not understand what is asked here.
I can just say that updating the affinity in order to control where the VM will migrate to is a test thing. I do not expect users to use it, but they may... in any case we do not test it in this context.

Contributor Author

Thanks @EdDev , the question is whether we want to verify the affinity post migration, or we assume it is what we asked for. The bug exposed that migration can end successfully with stale affinity - making our stuntime measurements incorrect.

Comment thread libs/vm/vm.py Outdated
patches = {self: {"spec": {"template": {"spec": {"affinity": template_affinity}}}}}
ResourceEditor(patches=patches).update()

def wait_for_vmi_affinity(self) -> None:

Contributor

per my understanding, the API contract of "set affinity" should mean "affinity is applied when I return." Hence the wait operation should be a part of set_template_affinity. Otherwise, every caller must remember the wait, and future callers may forget. Consider integrating the reconciliation wait into set_template_affinity itself.

Contributor Author

Ack, absolutely.
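
To illustrate the "affinity is applied when I return" contract discussed above, here is a minimal sketch in which the setter itself blocks until the change is visible. `FakeVM` is a hypothetical in-memory stand-in (no cluster, no `ResourceEditor`), and its delayed copy from template to VMI only simulates controller reconciliation; the real method in `libs/vm/vm.py` may differ.

```python
import time
from typing import Any, Dict, Optional

Affinity = Optional[Dict[str, Any]]


class FakeVM:
    """Hypothetical VM whose VMI picks up template changes only after a delay."""

    def __init__(self, reconcile_delay: float) -> None:
        self._reconcile_delay = reconcile_delay
        self._template_affinity: Affinity = None
        self._vmi_affinity: Affinity = None
        self._reconcile_at = 0.0

    def vmi_affinity(self) -> Affinity:
        # Simulated controller: copy template to VMI once the delay has elapsed.
        if time.monotonic() >= self._reconcile_at:
            self._vmi_affinity = self._template_affinity
        return self._vmi_affinity

    def set_template_affinity(
        self, affinity: Affinity, timeout: float = 10.0, interval: float = 1.0
    ) -> None:
        """Patch the template, then block until the VMI reports the same affinity."""
        self._template_affinity = affinity
        self._reconcile_at = time.monotonic() + self._reconcile_delay
        deadline = time.monotonic() + timeout
        while self.vmi_affinity() != affinity:
            if time.monotonic() >= deadline:
                raise TimeoutError(
                    f"VMI affinity did not converge to {affinity!r} within {timeout}s"
                )
            time.sleep(interval)
```

With the wait folded into the setter, a caller that triggers migration right after `set_template_affinity()` returns cannot observe the stale value, and no call site has to remember a separate wait step.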

Anatw and others added 3 commits July 1, 2026 15:42
virt-controller and VMIM controller reconcile independently — if
migration is triggered before the VM controller propagates the template
affinity to the VMI, the target pod is created with stale scheduling
rules.

Assisted-by: Claude <noreply@anthropic.com>
Signed-off-by: Anat Wax <awax@redhat.com>
Signed-off-by: Anat Wax <awax@redhat.com>
Assisted-by: Claude <noreply@anthropic.com>
@Anatw

Anatw commented Jul 1, 2026

Contributor Author

Changes:

  • Add a log showing the location of the pod before and after a migration.
  • Wait for the affinity inside set_template_affinity.
  • Rename _assert_vmi_affinity.
  • In _assert_vmi_affinity, raise an exception instead of using an assert.
  • Remove checks of the bug (it turned out to be an automation bug, not a product bug).

@Anatw

Anatw commented Jul 1, 2026

Contributor Author

/verified

Tested on bm03-tlv2
Openshift version: 4.22.0-rc.2
CNV version: 4.22.0
HCO image: brew.registry.redhat.io/rh-osbs/iib:1158778

uv run pytest -s --skip-deprecated-api-test --pdbcls=IPython.terminal.debugger:TerminalPdb --tc-format=python --cluster-sanity-skip-storage-check -m "tier3" tests/network/localnet/migration_stuntime/test_migration_stuntime.py tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py
==== 24 passed, 1 deselected, 245 warnings in 2528.49s (0:42:08) ====

@openshift-virtualization-qe-bot-3

Contributor

@coderabbitai

Test execution plan request details

CRITICAL: You MUST post an inline review comment on the first changed line of the first file.
The inline comment should contain the full Test Execution Plan (smoke decision, gating decision, and specific affected tests).
Do NOT submit a blocking review event (REQUEST_CHANGES/APPROVE).
Post a single inline PR comment on Files Changed (non-blocking COMMENT flow).

As an expert software testing engineer, analyze all modified files in this PR and create a targeted test execution plan.
You will post an inline review comment with the test execution plan on the first changed file.
If you fail to run or post a comment, retry.

Analysis Requirements:

  1. Examine code changes in each modified file

  2. Identify affected code paths, functions, and classes

  3. Analyze pytest-specific elements: fixtures (scope, dependencies), parametrization, markers, conftest changes

  4. Trace test dependencies through imports, shared utilities, fixture inheritance, fixture teardown, and yield from cleanup in conftest

  5. Detect new tests introduced in the PR

  6. Utilities and libs impact (when utilities/ or libs/ changes):
    You MUST use shell scripts (rg, git diff) to trace the full impact.
    Follow these sub-steps in order:

    6a. Identify modified symbols: For each changed file under utilities/ or libs/,
    list every modified function or method.
    Example: git diff HEAD~1 --unified=0 -- utilities/hco.py | grep '^[+-]def '

    6b. Find direct callers: Search tests and conftest for each symbol from 6a.
    Example: rg -l 'get_hco_version' tests/

    6c. Trace fixture teardown and cleanup: Find fixtures that reach
    the modified symbol through yield from or context-manager wrappers.
    Example: rg -l 'yield from.*enable_common_boot|def.*enable_common_boot' tests/

    6d. Trace same-file callers: In each changed file, find other functions
    whose body calls a modified symbol (including code after yield
    in @contextmanager helpers).
    Example: rg 'get_hco_version|enable_common_boot' utilities/hco.py

    6e. Expand transitively: If function A calls modified B, then
    tests/fixtures that call A are affected — even when the test body
    never imports B directly.

    Do NOT limit impact to tests that import the modified symbol only.

  7. Smoke test impact: Intersect the affected set from step 6 with smoke-marked tests.
    Run: rg -l '@pytest.mark.smoke' tests/
    VERIFY the above command returned actual file paths before concluding False.
    Set True if either condition is met:

    • a smoke-marked file appears in the affected set from 6b-6e, OR
    • any conftest.py in the smoke test's parent-directory hierarchy (up to repo root)
      imports or calls a modified utilities/libs symbol — including autouse fixtures
      that depend on modified functions. ALL tests in that directory and below are affected.
      Example check: for each smoke_file, scan dirname(smoke_file)/conftest.py,
      dirname(dirname(smoke_file))/conftest.py, etc. for modified symbol imports
      and autouse fixtures that depend on modified symbols.
  8. Gating test impact: Intersect the affected set from step 6 with gating-marked tests.
    Run: rg -l '@pytest.mark.gating' tests/
    Set True if a gating-marked file also appears in the affected set from 6b-6e.
    Utilities/libs changes often affect gating tests without affecting smoke tests.
    Do NOT stop analysis after concluding Run smoke tests: False.
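
The transitive expansion in 6e and the intersections in steps 7 and 8 can be sketched as a breadth-first walk over a reverse call graph. This is only an illustration: `affected_callers` and the `callers_of` mapping are hypothetical, and building that mapping from `rg` output is left out.

```python
from collections import deque
from typing import Dict, Set


def affected_callers(callers_of: Dict[str, Set[str]], modified: Set[str]) -> Set[str]:
    """Return every symbol that reaches a modified symbol through any chain of calls."""
    affected: Set[str] = set()
    queue = deque(modified)
    while queue:
        symbol = queue.popleft()
        for caller in callers_of.get(symbol, set()):
            # Skip callers already seen so cyclic call graphs terminate.
            if caller not in affected:
                affected.add(caller)
                queue.append(caller)
    return affected


# Hypothetical graph: leaf helper foo() is called by wrapper bar(),
# which a fixture teardown calls, which a test file depends on.
callers_of = {
    "foo": {"bar"},
    "bar": {"fixture_teardown"},
    "fixture_teardown": {"tests/a/test_x.py"},
}
affected = affected_callers(callers_of, {"foo"})
smoke_files = {"tests/a/test_x.py", "tests/b/test_y.py"}
run_smoke = bool(affected & smoke_files)
```

The smoke and gating decisions then reduce to checking whether the marker's file set intersects `affected`, which is why a test can be affected without importing the modified symbol directly.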

Output rules:
Do NOT include analysis step numbers (1-8) in your visible output.

Your deliverable:
Your inline informational comment will be based on the following requirements:

Test Execution Plan

  • Run smoke tests: True / False — If True, state the dependency path (test → fixture → changed symbol). True ONLY with a verified path.
  • Run gating tests: True / False — If True, state the dependency path. True if any gating-marked test is in the affected set.
  • Affected tests to run (required when utilities/, libs/, or shared conftest changes — list concrete paths even when smoke is False)

Use these formats:

  • path/to/test_file.py - When the entire test file needs verification
  • path/to/test_file.py::TestClass::test_method - When specific test(s) needed
  • path/to/test_file.py::test_function - When specific test(s) needed
  • -m marker - When a marker covers multiple affected tests (e.g. -m gating only if ALL gating tests in scope need run)
  • Tag each listed test or group with its marker when not obvious, e.g. (gating) or (smoke)

Real test commands (MANDATORY when changes affect session/runtime code):

When the affected code runs at session/collection time (conftest fixtures, pytest plugins,
config hooks, session-scoped setup) or modifies runtime behavior that unit tests mock away,
you MUST include concrete pytest commands the PR author must run on a real cluster
to verify the change works end-to-end. Include:

  • A command for the error/fix path (the scenario the PR fixes)
  • A command for the happy path (regression: the normal case still works)
  • Use lightweight tests (e.g., --collect-only for startup failures,
    a single small test for runtime behavior)
    If the PR only changes test logic (not utilities/libs/conftest), the affected test
    paths themselves serve as the real test commands — no separate section needed.

Example output for a session-startup fix:

**Real tests (cluster required)**
Error path (the fix):
`pytest tests/storage/.../test_foo.py --storage-class-matrix=nonexistent-sc --collect-only`
Expected: ValueError with clear message, not IndexError

Happy path (regression):
`pytest tests/storage/.../test_foo.py --storage-class-matrix=<valid-sc> -k test_bar`
Expected: session starts normally

Guidelines:

  • Include tests affected directly OR via fixture setup/teardown, yield from cleanup, or transitive utility call chains (caller calls modified helper)
  • Use a full file path only if ALL tests in that file require verification
  • Use file path + test name when only specific tests use an affected fixture or utility wrapper (preferred for partial file impact)
  • If a test marker can cover multiple files/tests, provide the marker
  • Balance coverage vs over-testing - Keep descriptions minimal
  • Example: if leaf helper foo() changes, include tests whose fixture teardown calls wrapper bar() where bar() calls foo(), even when the test body only imports an unrelated symbol from the same utilities module

Hardware-Related Checks (SR-IOV, GPU, DPDK):

When PR modifies fixtures for hardware-specific resources:

  • Collection Safety: Fixtures MUST have existence checks (return None when hardware unavailable)
  • Test Plan: MUST verify both WITH and WITHOUT hardware:
    • Run affected tests on cluster WITH hardware
    • Verify collection succeeds on cluster WITHOUT hardware

CRITICAL WORKFLOW COMPLETION RULES:

When responding to this test execution plan request, you MUST follow these rules EXACTLY:

  1. YOUR ONLY DELIVERABLE: Post one non-blocking inline comment containing the test execution plan on the first changed line
  2. THEN STOP IMMEDIATELY - Do NOT generate any additional response
  3. FALLBACK ONLY: If inline comment API calls fail after retrying, post as a regular PR comment
  4. SILENCE = SUCCESS: After successfully submitting the review, your task is complete. No confirmation needed.

ABSOLUTE PROHIBITIONS (violating these creates empty/meaningless reviews):

  • ❌ Do NOT post acknowledgment messages like "Test execution plan posted", "Review posted successfully", "I've successfully posted"
  • ❌ Do NOT mention review IDs, URLs, or confirmation of posting in the PR thread
  • ❌ Do NOT add any follow-up comments after submitting the review
  • ❌ Do NOT reply to confirm task completion
  • ❌ Do NOT explain what you did - just do it and stop

Remember: The pull request review is visible to users. Additional comments are redundant noise.

CRITICAL — Verification rules for smoke and gating decisions:
- For each True decision, you must have traced a concrete path: test file → fixture → utility function → modified symbol.
- Do NOT assume impact based on scope, file proximity, or infrastructure-sounding names.
Only set True when you trace a concrete path through steps 6a-6e, then 7 or 8.
WRONG: "utilities/hco.py changed → run smoke" (no traced dependency path)
WRONG: "Run smoke tests: False" with no affected-test list when gating tests use the changed utility
RIGHT: "Run smoke tests: False, Run gating tests: True" plus tests/.../test_foo.py::TestBar::test_baz (gating)
RIGHT: "Smoke test X uses fixture Y → wrapper Z → changed function W in utilities/hco.py"

@coderabbitai

coderabbitai Bot commented Jul 1, 2026

Contributor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

repo="RedHatQE/openshift-virtualization-tests"
pr="5347"
tmpdir="$(mktemp -d)"
trap 'rm -rf "$tmpdir"' EXIT

echo "== PR head SHA =="
head_sha="$(gh pr view "$pr" --repo "$repo" --json headRefOid --jq '.headRefOid')"
echo "$head_sha"

echo
echo "== Changed files in order =="
gh pr view "$pr" --repo "$repo" --json files --jq '.files[].path' > "$tmpdir/pr_files.txt"
cat "$tmpdir/pr_files.txt"

first_file="$(sed -n '1p' "$tmpdir/pr_files.txt")"
echo
echo "== First changed file: $first_file =="

echo
echo "== Patch for first changed file =="
gh api "repos/$repo/pulls/$pr/files?per_page=100" --jq '.[] | select(.filename=="'"$first_file"'") | .patch' > "$tmpdir/first_patch.txt"
cat "$tmpdir/first_patch.txt"

echo
echo "== Current libs/vm/vm.py content around affinity code =="
rg -n -C5 'def set_template_affinity|def wait_for_vmi_affinity|def _assert_vmi_affinity|_expected_vmi_affinity' libs/vm/vm.py || true

echo
echo "== Full diff for all changed files vs previous merge base =="
while IFS= read -r f; do
  echo "--- $f ---"
  git diff --unified=3 HEAD~1 -- "$f" || true
done < "$tmpdir/pr_files.txt"

echo
echo "== First changed line number =="
python - "$tmpdir/first_patch.txt" <<'PY'
import re, sys
patch = open(sys.argv[1], encoding="utf-8").read().splitlines()
cur = None
for line in patch:
    m = re.match(r'^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@', line)
    if m:
        cur = int(m.group(1))
        continue
    if cur is None:
        continue
    if line.startswith('+') and not line.startswith('+++'):
        print(cur)
        break
    if line.startswith('-') and not line.startswith('---'):
        continue
    cur += 1
PY

echo
echo "== Attempt inline comment on updated head =="
first_line=$(python - "$tmpdir/first_patch.txt" <<'PY'
import re, sys
patch = open(sys.argv[1], encoding="utf-8").read().splitlines()
cur = None
for line in patch:
    m = re.match(r'^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@', line)
    if m:
        cur = int(m.group(1))
        continue
    if cur is None:
        continue
    if line.startswith('+') and not line.startswith('+++'):
        print(cur)
        break
    if line.startswith('-') and not line.startswith('---'):
        continue
    cur += 1
PY
)
echo "first_line=$first_line"

cat > "$tmpdir/body.md" <<EOF
Test execution plan placeholder for verification of API access.
EOF
body="$(cat "$tmpdir/body.md")"

set +e
gh api \
  --method POST \
  "repos/$repo/pulls/$pr/comments" \
  -f body="$body" \
  -f commit_id="$head_sha" \
  -f path="$first_file" \
  -F line="$first_line" \
  -f side="RIGHT"
echo "inline_rc=$?"
set -e

Length of output: 7925


🏁 Script executed:

#!/bin/bash
set -euo pipefail

repo="RedHatQE/openshift-virtualization-tests"
pr="5347"

echo "== Full diff of test files across entire PR =="
for f in tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py tests/network/localnet/migration_stuntime/test_migration_stuntime.py; do
  echo "--- $f ---"
  gh api "repos/$repo/pulls/$pr/files?per_page=100" --jq '.[] | select(.filename=="'"$f"'") | .patch'
done

echo
echo "== Full diff of utilities/virt.py =="
gh api "repos/$repo/pulls/$pr/files?per_page=100" --jq '.[] | select(.filename=="utilities/virt.py") | .patch'

echo
echo "== Callers of set_template_affinity, wait_for_vmi_affinity, assert_affinity_after_migration =="
rg -n -C2 'set_template_affinity|wait_for_vmi_affinity|assert_affinity_after_migration' tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py tests/network/localnet/migration_stuntime/test_migration_stuntime.py

echo
echo "== smoke/gating markers globally =="
rg -l '@pytest\.mark\.smoke' tests || true
rg -l '@pytest\.mark\.gating' tests || true

Length of output: 15662


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Any smoke/gating test files call set_template_affinity? =="
rg -l 'set_template_affinity' tests || true

echo
echo "== Confirm set_template_affinity only used in stuntime tests =="
rg -n 'set_template_affinity' tests

Length of output: 2926


@openshift-virtualization-qe-bot-3

CRITICAL: Re-ran the full workflow against the latest PR head (e012eb60108921037b1a61027adfbcb7c8248b6d). The inline review comment attempt on libs/vm/vm.py, Line 13 failed again with the same error:

{"message":"Resource not accessible by integration","documentation_url":"https://docs.github.com/rest/pulls/comments#create-a-review-comment-for-a-pull-request","status":"403"}

The fallback issue-comment path is equally blocked by the same integration permission scope. This is a permissions issue on the GitHub App/integration token, not an analysis failure — write access to PR review comments/issue comments is required and currently absent.

Updated analysis reflects the latest changes: set_template_affinity() in libs/vm/vm.py now waits internally via a new @retry-decorated _expected_vmi_affinity() (10s timeout, 1s sleep) that raises ValueError on mismatch. The earlier standalone wait_for_vmi_affinity() call and the temporary CNV-90576 post-migration assertions were removed from the test files, since the issue turned out to be an automation bug, not a product bug.

Here is the exact comment body to post manually on libs/vm/vm.py, Line 13:

**Test Execution Plan**

- **Run smoke tests: False** — no verified dependency path from any `@pytest.mark.smoke` test to `libs/vm/vm.py::BaseVirtualMachine.set_template_affinity()` / `_expected_vmi_affinity()`.
- **Run gating tests: False** — no verified dependency path from any `@pytest.mark.gating` test to the changed symbols.

- **Affected tests to run**
  - `tests/network/l2_bridge/migration_stuntime/test_migration_stuntime.py` (tier3, all cases use `set_template_affinity()`)
  - `tests/network/localnet/migration_stuntime/test_migration_stuntime.py` (tier3, all cases use `set_template_affinity()`)
  - `tests/network/l2_bridge/rhel9_rhel10_cluster/test_connectivity.py` (also calls `set_template_affinity()`, now waits for VMI reconciliation and can raise `ValueError` on timeout)
  - `tests/network/localnet/rhel9_rhel10_cluster/test_connectivity.py` (also calls `set_template_affinity()`, same reasoning)

These are the concrete affected paths because `set_template_affinity()` now blocks on VMI affinity reconciliation (`_expected_vmi_affinity`, retry wait_timeout=10s) before returning, and can raise a `ValueError` if the VMI never reconciles — every caller of this method is exposed to this new behavior.

**Real tests (cluster required)**

Error path (the fix — race window closed):
`pytest -s -o log_cli=true -m tier3 --jira tests/network/localnet/migration_stuntime/test_migration_stuntime.py -k 'test_client_migrates_off_server_node or test_server_migrates_off_client_node'`
Expected: `set_template_affinity()` blocks until VMI affinity matches the template before migration starts; no stale-affinity migration target pod.

Happy path (regression, non-stuntime caller):
`pytest -s -o log_cli=true tests/network/l2_bridge/rhel9_rhel10_cluster/test_connectivity.py`
Expected: `set_template_affinity()` calls still complete normally within the 10s retry window; no `ValueError` timeouts introduced.
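
The blocking behavior described above (poll the VMI until its affinity matches, give up after the retry window) can be sketched roughly as follows. This is a minimal illustration, not the code in `libs/vm/vm.py`: the function name `wait_for_affinity`, the `read_vmi_affinity` callable, and the choice to raise `ValueError` once the deadline passes are all assumptions made for the example; only the 10s timeout and 1s sleep come from the analysis.

```python
import time
from typing import Any, Callable


def wait_for_affinity(
    read_vmi_affinity: Callable[[], Any],  # hypothetical: returns the current VMI affinity
    expected_affinity: Any,
    wait_timeout: float = 10.0,  # retry window quoted in the analysis
    sleep: float = 1.0,  # pause between polls quoted in the analysis
) -> None:
    """Poll until the VMI affinity equals the expected one, else raise ValueError."""
    deadline = time.monotonic() + wait_timeout
    while True:
        actual = read_vmi_affinity()
        if actual == expected_affinity:
            return
        if time.monotonic() >= deadline:
            # Assumption: surface the mismatch as ValueError once the window closes.
            raise ValueError(
                f"VMI affinity {actual!r} does not match expected {expected_affinity!r}"
            )
        time.sleep(sleep)


# Usage with a fake reader that reconciles on the third poll.
_readings = iter(["stale", "stale", "reconciled"])
wait_for_affinity(lambda: next(_readings), "reconciled", wait_timeout=1.0, sleep=0.01)
```

Only after such a wait returns would a caller trigger the migration, so the migration target pod is created from an already reconciled VMI.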

@codecov

codecov Bot commented Jul 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.66%. Comparing base (a2dc143) to head (e012eb6).
⚠️ Report is 197 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5347      +/-   ##
==========================================
- Coverage   98.67%   98.66%   -0.02%     
==========================================
  Files          25       42      +17     
  Lines        2487     2467      -20     
==========================================
- Hits         2454     2434      -20     
  Misses         33       33              
| Flag | Coverage Δ | |
|---|---|---|
| utilities | 98.66% <ø> (-0.02%) | ⬇️ |

@EdDev EdDev left a comment

Contributor

virt-controller and VMIM controller reconcile independently

I think you meant VM controller and VMIM controller ...

Comment thread libs/vm/vm.py
expected_affinity = (
asdict(obj=affinity, dict_factory=self._filter_out_none_values) if affinity else None
)
self._expected_vmi_affinity(expected_affinity=expected_affinity)

Contributor

Does it work by just using template_affinity (instead of expected_affinity)?
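
For context on the question, here is a small sketch of what `asdict` with a None-filtering `dict_factory` produces compared with a plain `asdict`. The `Affinity` and `LabelSelector` dataclasses and the `filter_out_none_values` helper are hypothetical stand-ins, not the types in `libs/vm/vm.py`; the sketch only shows the shape of the two results and does not settle which one the VMI comparison needs.

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class LabelSelector:  # hypothetical stand-in type
    match_labels: Optional[dict] = None
    match_expressions: Optional[list] = None


@dataclass
class Affinity:  # hypothetical stand-in type
    pod_anti_affinity: Optional[LabelSelector] = None
    node_affinity: Optional[dict] = None


def filter_out_none_values(items: list[tuple[str, Any]]) -> dict[str, Any]:
    """dict_factory: asdict passes (field name, value) pairs; drop unset fields."""
    return {key: value for key, value in items if value is not None}


affinity = Affinity(pod_anti_affinity=LabelSelector(match_labels={"app": "server"}))

# Plain conversion keeps every unset field as an explicit None entry.
plain = asdict(affinity)

# The factory is applied to nested dataclasses too, so unset fields vanish at every level.
filtered = asdict(affinity, dict_factory=filter_out_none_values)

print(plain)
print(filtered)
```

If the object read back from the cluster omits unset fields, the plain form would differ from it by its explicit `None` entries, which may be one reason to compare against a filtered dict instead.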

localnet_stuntime_client_vm.set_template_affinity(affinity=new_pod_anti_affinity(label=SERVER_VM_LABEL))
localnet_stuntime_client_vm.wait_for_vmi_affinity()
migrate_vm_and_verify(vm=localnet_stuntime_client_vm, client=admin_client)
if is_jira_open(jira_id="CNV-90576"):

Contributor

I do not understand what is asked here.
I can only say that updating the affinity to control where the VM migrates to is a test-only technique. I do not expect users to rely on it, although they may. In any case, we do not test it in this context.
